{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "de050974",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "7ce2807e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "np.random.seed(0)\n",
    "\n",
     "# Define the state transition probability matrix P\n",
    "P = [\n",
    "    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0],\n",
    "    [0.5, 0.0, 0.5, 0.0, 0.0, 0.0],\n",
    "    [0.0, 0.0, 0.0, 0.6, 0.0, 0.4],\n",
    "    [0.0, 0.0, 0.0, 0.0, 0.3, 0.7],\n",
    "    [0.0, 0.2, 0.3, 0.5, 0.0, 0.0],\n",
    "    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],\n",
    "]\n",
    "\n",
    "P = np.array(P)\n",
    "\n",
     "# Define the reward function\n",
     "rewards = [-1, -2, -2, 10, 1, 0]\n",
     "# Define the discount factor\n",
     "gamma = 0.5\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2736222f",
   "metadata": {},
   "source": [
     "### Why the return is computed in reverse\n",
     "The return is a discounted accumulation of future rewards, which is why it is computed backwards. By definition, the return $G_t$ starting from state $s_t$ is\n",
     "\n",
     "$$G_t = R_t + \\gamma R_{t+1} + \\gamma^2 R_{t+2} + \\cdots = R_t + \\gamma G_{t+1}$$\n",
     "\n",
     "where $R_i$ is the reward received at time step $i$ and $\\gamma$ is the discount factor. Note that $G_t$ depends on $G_{t+1}$, the return from the future. To compute the return of the current state correctly, we therefore start at the end of the sequence and work forwards towards the start: whenever we reach a state, the return of its successor is already known, so the discounted accumulation is immediate. For example, for the state sequence [1, 2, 3, 6], the return from state 1 requires the return from state 2, which in turn requires the return from state 3, and so on. Computing in reverse guarantees that the future return needed at each step has already been computed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "31af10e6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
       "i = 5, chain[i] = 6, rewards[chain[i] - 1] = 0\n",
       "G updated to 0.0\n",
       "i = 4, chain[i] = 5, rewards[chain[i] - 1] = 1\n",
       "G updated to 1.0\n",
       "i = 3, chain[i] = 4, rewards[chain[i] - 1] = 10\n",
       "G updated to 10.5\n",
       "i = 2, chain[i] = 3, rewards[chain[i] - 1] = -2\n",
       "G updated to 3.25\n",
       "i = 1, chain[i] = 2, rewards[chain[i] - 1] = -2\n",
       "G updated to -0.375\n",
       "i = 0, chain[i] = 1, rewards[chain[i] - 1] = -1\n",
       "G updated to -1.1875\n",
       "Return computed from this sequence: -1.1875.\n"
     ]
    }
   ],
   "source": [
     "# Given a sequence, compute the return obtained from a given start index to the end of the sequence (the terminal state)\n",
     "def compute_return(start_index, chain, gamma):\n",
     "    '''\n",
     "    Parameters:\n",
     "        start_index: index in the state sequence `chain` from which to compute the return.\n",
     "        chain: a state sequence, i.e. the order of states the agent visits in the environment.\n",
     "        gamma: the discount factor.\n",
     "    The function initializes the return G to 0, then iterates backwards over the states from\n",
     "    start_index to the end of the sequence via reversed(range(start_index, len(chain))).\n",
     "    At each state it discounts the accumulated return by gamma and adds the current reward.\n",
     "    Finally it returns the computed return G.\n",
     "    '''\n",
     "    G = 0\n",
     "    for i in reversed(range(start_index, len(chain))):\n",
     "        print(f\"i = {i}, chain[i] = {chain[i]}, rewards[chain[i] - 1] = {rewards[chain[i] - 1]}\")\n",
     "        G = gamma * G + rewards[chain[i] - 1]\n",
     "        print(f\"G updated to {G}\")\n",
     "    return G\n",
     "\n",
     "\n",
     "# A state sequence: s1-s2-s3-s4-s5-s6\n",
     "chain = [1, 2, 3, 4, 5, 6]\n",
     "start_index = 0\n",
     "G = compute_return(start_index, chain, gamma)\n",
     "print(\"Return computed from this sequence: %s.\" % G)"
   ]
  },
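  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f1e2a9b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity-check sketch: the backward recursion above must agree with the direct\n",
    "# definition of the return as an explicit discounted sum over the same chain.\n",
    "G_direct = sum(gamma**t * rewards[s - 1] for t, s in enumerate(chain))\n",
    "assert abs(G - G_direct) < 1e-9\n",
    "print(G_direct)"
   ]
  },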
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "bb968212",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
       "State values of the MRP:\n",
       " [[-2.01950168]\n",
       " [-2.21451846]\n",
       " [ 1.16142785]\n",
       " [10.53809283]\n",
       " [ 3.58728554]\n",
       " [ 0.        ]]\n"
      ]
     }
    ],
    "source": [
     "def compute(P, rewards, gamma, states_num):\n",
     "    ''' Compute the analytic solution via the matrix form of the Bellman equation; states_num is the number of states in the MRP '''\n",
     "    # Write rewards as a column vector\n",
     "    rewards = np.array(rewards).reshape((-1, 1))\n",
     "    value = np.dot(np.linalg.inv(np.eye(states_num, states_num) - gamma * P), rewards)\n",
     "    return value\n",
     "\n",
     "V = compute(P, rewards, gamma, 6)\n",
     "print(\"State values of the MRP:\\n\", V)\n"
   ]
  },
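  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c3d5e7f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity-check sketch: the analytic solution V = (I - gamma * P)^{-1} R computed above\n",
    "# must satisfy the Bellman equation V = R + gamma * P V.\n",
    "rewards_col = np.array(rewards).reshape((-1, 1))\n",
    "assert np.allclose(V, rewards_col + gamma * np.dot(P, V))"
   ]
  },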
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "7b35b274",
   "metadata": {},
   "outputs": [],
   "source": [
     "S = [\"s1\", \"s2\", \"s3\", \"s4\", \"s5\"]  # state set\n",
     "A = [\"保持s1\", \"前往s1\", \"前往s2\", \"前往s3\", \"前往s4\", \"前往s5\", \"概率前往\"]  # action set (stay at s1, go to s1, ..., go to s5, probabilistic transition)\n",
     "# State transition function\n",
    "P = {\n",
    "    \"s1-保持s1-s1\": 1.0,\n",
    "    \"s1-前往s2-s2\": 1.0,\n",
    "    \"s2-前往s1-s1\": 1.0,\n",
    "    \"s2-前往s3-s3\": 1.0,\n",
    "    \"s3-前往s4-s4\": 1.0,\n",
    "    \"s3-前往s5-s5\": 1.0,\n",
    "    \"s4-前往s5-s5\": 1.0,\n",
    "    \"s4-概率前往-s2\": 0.2,\n",
    "    \"s4-概率前往-s3\": 0.4,\n",
    "    \"s4-概率前往-s4\": 0.4,\n",
    "}\n",
     "# Reward function\n",
    "R = {\n",
    "    \"s1-保持s1\": -1,\n",
    "    \"s1-前往s2\": 0,\n",
    "    \"s2-前往s1\": -1,\n",
    "    \"s2-前往s3\": -2,\n",
    "    \"s3-前往s4\": -2,\n",
    "    \"s3-前往s5\": 0,\n",
    "    \"s4-前往s5\": 10,\n",
    "    \"s4-概率前往\": 1,\n",
    "}\n",
     "gamma = 0.5  # discount factor\n",
    "MDP = (S, A, P, R, gamma)\n",
    "\n",
     "# Policy 1: a uniform random policy\n",
    "Pi_1 = {\n",
    "    \"s1-保持s1\": 0.5,\n",
    "    \"s1-前往s2\": 0.5,\n",
    "    \"s2-前往s1\": 0.5,\n",
    "    \"s2-前往s3\": 0.5,\n",
    "    \"s3-前往s4\": 0.5,\n",
    "    \"s3-前往s5\": 0.5,\n",
    "    \"s4-前往s5\": 0.5,\n",
    "    \"s4-概率前往\": 0.5,\n",
    "}\n",
     "# Policy 2\n",
    "Pi_2 = {\n",
    "    \"s1-保持s1\": 0.6,\n",
    "    \"s1-前往s2\": 0.4,\n",
    "    \"s2-前往s1\": 0.3,\n",
    "    \"s2-前往s3\": 0.7,\n",
    "    \"s3-前往s4\": 0.5,\n",
    "    \"s3-前往s5\": 0.5,\n",
    "    \"s4-前往s5\": 0.1,\n",
    "    \"s4-概率前往\": 0.9,\n",
    "}\n",
    "\n",
    "\n",
     "# Join two strings with '-' so they can index into the P and R dictionaries defined above\n",
     "def join(str1, str2):\n",
     "    return str1 + '-' + str2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "0f6a06de",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
       "State values of the MDP:\n",
       " [[-1.22555411]\n",
       " [-1.67666232]\n",
       " [ 0.51890482]\n",
       " [ 6.0756193 ]\n",
       " [ 0.        ]]\n"
      ]
     }
    ],
    "source": [
     "gamma = 0.5\n",
     "# Transition matrix of the MRP obtained from the MDP under policy Pi_1\n",
     "P_from_mdp_to_mrp = [\n",
     "    [0.5, 0.5, 0.0, 0.0, 0.0],\n",
     "    [0.5, 0.0, 0.5, 0.0, 0.0],\n",
     "    [0.0, 0.0, 0.0, 0.5, 0.5],\n",
     "    [0.0, 0.1, 0.2, 0.2, 0.5],\n",
     "    [0.0, 0.0, 0.0, 0.0, 1.0],\n",
     "]\n",
     "\n",
     "P_from_mdp_to_mrp = np.array(P_from_mdp_to_mrp)\n",
     "R_from_mdp_to_mrp = [-0.5, -1.5, -1.0, 5.5, 0]\n",
     "\n",
     "V = compute(P_from_mdp_to_mrp, R_from_mdp_to_mrp, gamma, 5)\n",
     "print(\"State values of the MDP:\\n\", V)"
   ]
  },
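  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b9f0c2d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: derive the MRP matrices under Pi_1 programmatically by\n",
    "# marginalizing out the actions, P'(s'|s) = sum_a Pi(a|s) P(s'|s,a) and\n",
    "# r'(s) = sum_a Pi(a|s) r(s,a); this should reproduce the hand-written\n",
    "# P_from_mdp_to_mrp and R_from_mdp_to_mrp above.\n",
    "P_pi = np.zeros((5, 5))\n",
    "R_pi = np.zeros(5)\n",
    "for i, s in enumerate(S):\n",
    "    for a in A:\n",
    "        pi_sa = Pi_1.get(join(s, a), 0)\n",
    "        if pi_sa == 0:\n",
    "            continue\n",
    "        R_pi[i] += pi_sa * R.get(join(s, a), 0)\n",
    "        for j, s_next in enumerate(S):\n",
    "            P_pi[i, j] += pi_sa * P.get(join(join(s, a), s_next), 0)\n",
    "P_pi[4, 4] = 1.0  # terminal state s5 is absorbing; P has no entries for it\n",
    "assert np.allclose(P_pi, P_from_mdp_to_mrp)\n",
    "assert np.allclose(R_pi, R_from_mdp_to_mrp)"
   ]
  },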
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "632a3ad6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
       "Episode 1\n",
       " [('s1', '前往s2', 0, 's2'), ('s2', '前往s3', -2, 's3'), ('s3', '前往s5', 0, 's5')]\n",
       "Episode 2\n",
       " [('s4', '概率前往', 1, 's4'), ('s4', '前往s5', 10, 's5')]\n",
       "Episode 5\n",
       " [('s2', '前往s3', -2, 's3'), ('s3', '前往s4', -2, 's4'), ('s4', '前往s5', 10, 's5')]\n"
     ]
    }
   ],
   "source": [
     "def sample(MDP, Pi, timestep_max, number):\n",
     "    ''' Sample episodes under policy Pi, each capped at timestep_max steps, generating `number` episodes in total '''\n",
     "    # Unpack the MDP tuple: state set, action set, transition probabilities, reward function, discount factor\n",
     "    S, A, P, R, gamma = MDP\n",
     "    # episodes: list holding all sampled episodes\n",
     "    episodes = []\n",
     "    for _ in range(number):\n",
     "        # episode: the episode currently being generated\n",
     "        episode = []\n",
     "        # timestep: current time step within this episode\n",
     "        timestep = 0\n",
     "        # Pick a random starting state other than s5\n",
     "        s = S[np.random.randint(4)]\n",
     "        # An episode ends when the terminal state is reached or the step cap is exceeded\n",
     "        while s != 's5' and timestep <= timestep_max:\n",
     "            timestep += 1\n",
     "            # Draw a random number in [0, 1) and reset the cumulative probability temp to 0\n",
     "            rand, temp = np.random.rand(), 0\n",
     "            # Choose an action in state s according to the policy: accumulate the\n",
     "            # probabilities Pi(a|s) over the actions until they exceed rand\n",
     "            for a_opt in A:\n",
     "                temp += Pi.get(join(s, a_opt), 0)\n",
     "                # Once the cumulative probability exceeds rand, take a_opt and look up its reward\n",
     "                if temp > rand:\n",
     "                    a = a_opt\n",
     "                    r = R.get(join(s, a), 0)\n",
     "                    break\n",
     "            # Draw a second random number to sample the next state\n",
     "            rand, temp = np.random.rand(), 0\n",
     "            # Sample the next state s_next according to the transition probabilities P(s'|s, a)\n",
     "            for s_opt in S:\n",
     "                temp += P.get(join(join(s, a), s_opt), 0)\n",
     "                if temp > rand:\n",
     "                    s_next = s_opt\n",
     "                    break\n",
     "            # Append the transition (s, a, r, s_next) to the current episode\n",
     "            episode.append((s, a, r, s_next))\n",
     "            # Move to the next state and continue\n",
     "            s = s_next\n",
     "\n",
     "        episodes.append(episode)\n",
     "    return episodes\n",
     "\n",
     "\n",
     "# Sample 5 episodes, each at most 20 steps long\n",
     "episodes = sample(MDP, Pi_1, 20, 5)\n",
     "print('Episode 1\\n', episodes[0])\n",
     "print('Episode 2\\n', episodes[1])\n",
     "print('Episode 5\\n', episodes[4])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "a23a1b8c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
       "State values of the MDP estimated by Monte Carlo:\n",
      " {'s1': -1.228923788722258, 's2': -1.6955696284402704, 's3': 0.4823809701532294, 's4': 5.967514743019431, 's5': 0}\n"
     ]
    }
   ],
   "source": [
     "# Estimate the value of every state from all sampled episodes\n",
     "def MC(episodes, V, N, gamma):\n",
     "    '''\n",
     "    episodes: a list of sampled episodes; each episode is a list of (s, a, r, s_next) tuples,\n",
     "        where s is the current state, a the action taken, r the reward received, and s_next the next state.\n",
     "    V: dict holding the value estimate of each state, all initialized to 0.\n",
     "    N: dict counting how many times each state has been visited, all initialized to 0.\n",
     "    gamma: discount factor in [0, 1] weighting future rewards.\n",
     "    '''\n",
     "    for episode in episodes:\n",
     "        G = 0\n",
     "        # Walk each episode backwards\n",
     "        for i in range(len(episode) - 1, -1, -1):\n",
     "            (s, a, r, s_next) = episode[i]\n",
     "            G = r + gamma * G\n",
     "            N[s] = N[s] + 1\n",
     "            # Incremental mean update: V(s) <- V(s) + (G - V(s)) / N(s)\n",
     "            V[s] = V[s] + (G - V[s]) / N[s]\n",
     "\n",
     "timestep_max = 20\n",
     "# Sample 1000 episodes (feel free to change this)\n",
     "episodes = sample(MDP, Pi_1, timestep_max, 1000)\n",
     "gamma = 0.5\n",
     "V = {\"s1\": 0, \"s2\": 0, \"s3\": 0, \"s4\": 0, \"s5\": 0}\n",
     "N = {\"s1\": 0, \"s2\": 0, \"s3\": 0, \"s4\": 0, \"s5\": 0}\n",
     "\n",
     "MC(episodes, V, N, gamma)\n",
     "print(\"State values of the MDP estimated by Monte Carlo:\\n\", V)\n"
   ]
  },
  {
   "attachments": {
    "image.png": {
     "image/png": "iVBORw0KGgoAAAANSUhEUgAAAvgAAACECAIAAABqP8gnAAAgAElEQVR4Ae2dh1sUydq337/tW/q93nOk3bPHtLuuYaVBwYAEEVSSCVCMKCpJjItKEEFRQVxFEUQEQUyYA+hiXIGZjvNZ1WGqunuGHmB0nH285pKe6grPc1d11W+qq6v/xwP/gAAQAAJAAAgAASAQpgT+J0z9AreAABAAAkAACAABIOABoQONAAgAASAABIAAEAhbAiB0wrZqwTEgAASAABAAAkAAhA60ASAABIAAEAACQCBsCYDQCduqBceAABAAAkAACAABEDrQBoAAEAACQAAIAIGwJQBCJ2yrFhwDAkAACAABIAAEQOhAGwACQAAIAAEgAATClgAInbCtWnAMCAABIAAEgAAQAKEDbQAIAAEgAASAABAIWwIgdMK2asExIAAEgAAQAAJAAIQOtAEgAASAABAAAkAgbAmA0AnbqgXHgAAQAAJAAAgAARA60AaAABAAAkAACACBsCUAQidsqxYcAwJAAAgAASAABEDoQBsAAkAACAABIAAEwpYACJ2wrVpwDAgAASAABIAAEAChA20ACAABIAAEgAAQCFsCIHTCtmrBMSAABIAAEAACQACEDrQBIAAEgAAQAAJAIGwJgNAJ26oFx4AAEAACQAAIAAEQOtAGQpnAQG0yxxR2hrKJYBsQAAJAAAiEMgFnQudNfTybXvvGcAQNP/F1A8Z3uwMfQxTKiivosksx8bCuQobl8Adbi76qZiNjfBTqtbOr0FeciVs2iTn48MVcR/YlDtalM8n1g/YnfYcikoXOKq2zgLWVJr7CfRfq8XgctBbkkU2LwtXq2NN3jflzYpLmxCTt7/NrzySedN2t3VecsyppTsLxPslnvvzwx8+fnH2GBZ+5wAkgAASAwD+YQNCEjq8hylf4OOsAj2eauOHMgzHWPUjfvOksSEYCyE6coQEYa6CBrsJ0XRj5swaNrIHOMWCvdRGmSjEn/3tVGmG5jyEcFUGKUXsXVFkwttahcxvsKow307M3A+dvI4lwuL15mkneSuSMWugqtKNEyRcf+inAZmbY4EMNazCNaM6qUvdXEuzESs++3zXvcpqHzBE01WJq3nY0DG4UFvvah1AgAASAwD+QQNCEjq85gABHIOdVggcheoily1JHTUIx4LwpO72Dt80Qq4sbctR3NvLpA54DZ3CGpvjIKsps7BcVok1+UAltXDAGRdsD3UFsJhIQZBEWw+wUBjbMKgLi6+rRNI/NB9UXyhkN0rqbKBPsCF19FquwAeY8NQI+6oVuHkR1GPGDInR8YLEDolPSVIsudHLrnz195utTm4tTjUvoqI6TFU1QgcPwJYB/BEK9h28Fg2cUgUkSOqgrp0YRu1EWR7AZvSiDxv0Fd9mUDdYbH12F6QVd1B03ZCc5wCPzKLmA7cHjDRFNHR5sB0XsOG1GIC7hnE0G6AqAyMeHs1RCZInjwc/MAcsOioyHVja4HumOElNCJXYW0AwNJl2FlHhSHUK+oFS6m1oV4NyQjvHCpL02SzEP6rtVAmYDPKqcIrIiWKJDnLMuMsziCYWrnurRco60t7VrnyMr1PjpR/SQtoZdSVjE6NWBWfmTNdYStYozIPizTcvZcV17fcejnW0z9saBo3AlALUfrjULflkI+BE6aCDROkHz8K+PSWp22pjXOWgs4kEh3vHJ4/F4B10cORh9Kz0KYsvGLstmRLQg8nhsRn3sESF99FSW0Vc/4fAv9kIfILU0NG0t0FIQcpZK6GXuoGyL0FHHfqoSB+vS4ws7tfU9XlWh5U5aTtUFaZhdjaDIFqFTgG4jFnYZ6ge3IXotjoWAbpKeIeU2ZRJ1Bn3BZ/2JCVrokFiQGUhqEI1Bz02vDuy11/i+I3g90BFqOZBtoNrwWI4hVJQhp4yDI+nYgICFjgWgBQsEhDeBgLqI8EYB3oU3gQkLHbufBaijJ/p9TSioHbE61BUW0
hHGCVkfUfwNUXa/pLURCCfXRyM8Y+FDgeEf1pRH9gbTGdrH8R9ql4Ot0EFjMzWhgiqCHID9l+PgrJ0oMZKhLpIsDkU2SKJYxspuc2faZa565LJZ6HDxybp3WgOzjsqWEE3ooHBrPZoNNjzBBxg7akUrKoypGuqgawAt9dWj7bjgXSDckqMKna0txjqb/hOpuNXpQAyhU2gslh+jxeo1O9SQi5ZIO/rkNr+jnfL/DfsyqQ3Gf3lwNgQJ+L3GQ9BeMAkIjI+Ac6FDdM2F9eqqER99JRqY6ZGGGJP0Th+NBAH/BvXno50xRLk2SbF88Q7P9gMkTudQ6KAc9LsnNuV51Ls/1vsUhGLAXugDpJaHvdAxF2ASOlgf2Ik8oh4tltC15mvWyqM5Qis/rCT8ZW5jDG4AyGWz0KEIYCbW1mKpXFXo1BXatiuz3qLxaUXYKSQyohHNxhcLTG9L0Nt8QSBC51lzccG+wD/Nz0iDfR9b6BlRdWtVH3XJZZzWa1/3l4pAp8XNyebawRiNKtYtIdOaegbylH4bUTUIZ6XN/OmVYlVv6oWpNU5kML46TK3dPhAVo1lI1L5mPNHmLYWSF6Bjdwgxja3VE9q0XsoFk4WGMWoXpzlu9he5ZlM7ZE37ONaytcvQRwoIBgLflIBzoWN0TPrlQU/Xe73osow0uJ/SrgrjWO289CvZm3y8R3qXR6Y3X8bULIjWExl+oc7Cx6XrzcfbtVks957yqeHsikBmGB2TOmdgmKT6gkqnhhPSRf0YuW8xST9J/sW+OIqJpmW8ozWRB0ZtMpI4TR/a9NF0BGL8MHSSN3P1bEEXQkcbo3Xu3swIawfr0k1V6d8MwwZTKm/m+MiIpo+phsG2B7oX+jitZa41PG+lo7wtgX0HnE3kmOZ7DvSYbLb/SrAiI+i09TDVclLRYjuJ1thZq28zYU1bizYk8F47eqamRo7rMZlsvbiuvU10oDZZJ6mDMqoJF5oe742AiyMuKBWsxWBcBOkXeXvdMFQ7wJG9AksrosC74Mx8TalW6ftxmEzy5w7uQ4yGMVBbqO0EYdN6cUXoHHSGlEe0j+aK03zDpholmj338V31yFdv6SMRBAOBb0dg/EKH6DtI89E1gC4/dF1p1w91LRlCByVSe5BALzOyOO8xVYoWjC9I4+KnpJV2rRJjJzJG7zi82eIjOh/1Foa3I0ZRtK6tS72Do/pF9M5afnZFEKC8+RirnVBCVLoP2lq+6q0i+7uBVIeoDzy08d5czEdmx/F57J3THIjlWebMte8IHcpNdxNVE0ZH1ReKjIcBY2oHmUFhQZ5qzLXqIDBSQ8VQc7ZJH8zWlMqUuRZ5QdwSwtlyDDveW1fqTkS4Rhg2dhZpw9xYLJ5M18LHrirHkzpVPZ99ELYGY0dMZWm7FlFIzcLCwtzImrqujVC9vRnXID5D1w5uTnpfoaXEuZkt0c5RbVKtESom1eB9GkxLCu+0DWm6fowtJBu8pWVSPwmsKKwheta0EPRrLWmAXb0Y/a2aN9XgcZA1BAUTV43XKDgCAuFFYLKFjj6dM/imE+1pi6QD1TFZnoSiz04Art534wzRr3/UjxPXNu6t9A5X62rrvEOjqrrGKXSMvhX1aNpYi3tSk3JCNpiLQGm9Qw49Bqg4kEdUV25DyXccc0eG+Zg6TZsM9SBzcm2ditkLLTqGrN/U8D/tYXiEXDYLHVV2eLHo1ugDJyJmW5ZXXBJVj1JTX9WBypmd5DwZrh16cZJhht60dLVKzD/pxWnQcGvxAcfkstaYp8z2yq+fpmE4M5Z5F+6oKs15nZpo6HCxdyYDaAVgaQx6UrVVWNPq9UXA0fkYNUVdmHqG2HE6lX6Kqkqb6wXT1lqXb4PNHZGfmGoVU8Ygm40GjAzDhar1a4fRoTs+r02q9aogcCvSL0MrQ4uFvjQNYblBGA6AQJgRmGShYyxBVTGpg733eRO6R5g0lPpAggcPowNF2eudDu4IiJFAs5Pq3VAcveMwmWbup
1C2em64CL3XQ5Z4DaBOoSztikA2eEcInMSbA7YDlU71qibr7DbRMaLgDL35a78gdeONaH4OqE5WRe0zuY2DVHKtGMojZCHKUA+kGfo2DJVFYaFq0zw/gcwwxiq9wfy0cofvdTA5UaoSIpxVK5RhU1d7V8/o0WJzjKy2rFxGNUWtuNQT1M4Gvj3znsENj+Xm7DqnP2N1bksMFjrEc1ja0+yEnd4M7I9wtgYNPY5dTVFKBUUg2qqeTo9jb4BNWXQjt9QjyhenIjPUq0wTiPopS/OmlJlvg/UidAhU8yAcw4dWCy0h2Dy198CFqkqd/l8vC+Xpwx31RhvyUXdQtQXlSYeoMfX+yt4ejRUl6E19CyXRzH7DdyAQLgQmWehYsNh3WPr1aYkeaAD+WaN2AXZdHtGhmLoJtSBqaESdhQ/DzJ01KgtnqHZq3lSo/6K7EqIHnDyh4715r/th6Qc1kmbLtSGE6vvU7pg2m6oIoxvFuZkctMT00sCnbPpoQ9PgCDpMlDkSLgZD02Cg26yLG8Mq3QKqNlEgbhKaX8gMY6TRc9az0nOg/qL8TUMOzpAevXSr/A0qAzWxKFp6LbqzOWZysiJU4Nxkz+hYWwXy3K6mKEHgIwIVh0KIvtiURdaLelFYKoJMpVaEFwtpBs6K1PEWoWN74WMribSoCFO7JRyxtDT8i4WymbjMSfOITIxDf+6okYxmZhRhkyfu+nSbLRYS9hgF2x84j2mfHkKBwHdAYLKEDuqYjMvS8Fu9YvWrEQcH7boiui2jfNR5m8YqHKL3jNTQ6KezI7tdlDkqK7m+C71lydv/ohPIOzqEsEXt0+1GRN0eemzWk9JsqQ4OR8FIzW5qidVe1Zu/zW9lvRh/f40iTJMo5jRmhtYuG6eg+mUVps2Mjg1MMiF5jHOlalO1zFj4Sdeg7o61xRIOofxNVLv2qTKFXF6zbIoazXovyWgJaiNk19b2jFPoTPaMjr2msb+CiEHdRwTEzGYk1lBi8rTawPkYlwnmbGhQLZW3cq2FkmVZz5LKye4sUcO4GaDeCVUQeY0QcdCh1xj9hCXEyMo7i6zHpf9aTSLdIeOicL0J4WPaQqofsNhjYzOZN3FMWE6EwiEQCCsCkyV0rFeaPpti6sKCdl1ZehDcw7JoRxajvzA0it2ud8gFSpN5K5oeJnWhY/NqTOSd0YN70+tHforQomAvTDmg0tUh2a5PRHmiuQp1NKVHFByYHq8nx2XYDDy6eT7/YqvwGG+qTXMKykFkbWGnanOXuusPrn0s9by9Nsocma27aTA0DryloPx1dUIe4xiIgAmdkZKuQd0MPSsjGnmAwVI3EYSWzRjCtGLi/aZ6NIKMjksz5t2ZtaTLg3VrkTaalrqv56NR3uf+4/F48c2swk5iTbHNPjo2a3TURc3EomkjW18HNgOnfj/FzISkirmZI6hlUOMuVay1LBxi1JQK0NseUGIiN7X9ENcajq+3c4yaTkuO8X4MRsVorQIVQVQfZT36gkqkvbaE4IK03sNvof7doYomIGA3DWIoltrG9P7KYo8qPf2pN70osn71MPgLBMKMQEBCB3cx6JpHC42pK9/m2saR9f7IS43sEbyh3iO6E/SGj3lEdnlqL6Df1aaGdqpToy5yZLDecZhKo4fJ8QsdU7Y2X609mtodxxdaX6updcG6m9rw4P1KSQfDNYqGjQXmIBxf20rAeVpvz2v07Mg1a3vQu2wsBdT5Ev2HLGoqVOdODzneIjSTqdo0uYEs97ZY3AiNFzuYoupfcQOmhM7DI+r6GMoLPRoxUmrNTzNel0feVCP9FVjrsFzsvpb+F3cbdqSq00Jz9pEqx8MP9OhLc4zdC23W6HjjPPIqJ90LH399sFIt914FluvaGqFAf7zcfOV2FdoO/DQcvQ2TnNXa0XnSJeqtUYeJz/oWOnrrIj0yDEZo0MMT6ZZmZoJmaWlW6YNtNkrBKIwrTlVUmpF+3eks0P2yTJLhZqYz0X7VePsrq4X67
0wiQ09XofcS0F3EppoA6ud8/tVqwfDXZ0Q4AQRCg8AYQocefrzXA7o8jKuO/gWG/cKXJXmNGd7SPYIRrB84H0r1FPpf1IOgEtWL0GsqOu/tOtFZ79VOdffIZh+XLs5T91ftqqgV1roNY926MuL5PMCZ06O7arx5yLfKGpwn8khfr4OOdQ5qJgYf26qxGKX2195M1AhqEToNSyJqZYy5v7arfb3i9JxQHEzAONDPTEDo0COBSmPs5TJYeBmsXhxXlydP2UfuVYObOhaC7x6pcqRlH37eUJtekjoL8FTNT/vvev3wCP1V6T9RBsTGV/R8logoHs+7xnzvo1X6s+g+Z3RikuY43EQHFUIDIYtV61e3zeaK8B3BuDRwv0E0YzJJYSfdyDVLyLTeKxQbprdDVB0FXdRtMpxKb+SaFxbXyNLN1ziuPqOKSQ7eY0uGVnqWhk26Y5LUvt3Bxujkzb8KyEabXD+IndJrx2qhaj2dodEbeF2jejZv8BhHah/rq7ccIzGcBgJfn4C90NGvUqKrMplGXnXqlWmMfOopX32HpUegM/Z1xdKxiG+6qagTNPWPRCy7H46aPjN8REXrHQedlOjX1OJQNNzR+C3RlImjr7gTpDpuVKIZptbR2JfeVYjeSIU5myKQrGgJq02l6PG1/M0Sh/BAz8piqmUbSeQRbb/a0ftATZRhbWO4pelGmjpx1QWjNol8zBWtC19jRPF/oBvfV6I+S8WtbiYnTnQzCjs/N+dQVPEdLr51hzpbU9B8t/ZAccG+wtUJSbNmaMCp+LoZP0UlzVl37q4mm4y5HPXA74xOe1u740kda73QyL7at4Av+Um17NuWPqmujCMzSi2NIz0kAQLfBwF7oTMB28fqOPwLHXTh+RirJmCTz6RUcchyYvQ1Bns8JmmjHQok4qAktmOVKVAfm20NoQuibLCLrw7/hqy0i6LNo4wVxzapqkL8Gkym0wjE1zUhFLomIGP4GFBVr/3WNfLUFIFsXeQxLtBSm2QtUB7pEopY5GvSE23t7fpryXWnPvfXZ0dxzLQdLS7SP70NFHZ6tKer1CYRu+IMfppcelabHsvMLu+TBmpXelvLlLnpq0vq2x8N8a6h/vb6/ZvXztE3LWRYLufKiFoRpAuOjp1XOoZAYSHd+nrHlnr8ekWrs4+UWP+KhX/7onxcm9/eMLAACEwugUkXOhMyz272YkIZQmIgYEPANdDVjpSN+qpOmwgo6GM/jtPeM8AbMaShrh4/m+EIvP6yT568CSUNdfXhVI+ajzS29b34SJ01Mvd4PNLI5zcPu9rvviOTkxEm9RhPy5mk5KQW4Cizbyd0QkXqOcI0+ZGoHwaTnz3kCARCh0BoCZ3Q4QKWAIF/AgE8b/Rttc63EDr6lF4ITGh9o1aGVI7/e/3fyDAoFggEgQAInSBAhSyBABAAAkAACACB0CAAQic06gGsAAJAAAgAASAABIJAAIROEKBClkAACAABIAAEgEBoEAChExr1AFYAASAABIAAEAACQSAAQicIUCFLIAAEgAAQAAJAIDQIgNAJjXoAK4AAEAACQAAIAIEgEAChEwSokCUQAAJAAAgAASAQGgRA6IRGPYAVQAAIAAEgAASAQBAIgNAJAlTIEggAASAABIAAEAgNAiB0QqMewAogAASAABAAAkAgCARA6AQBKmQJBIAAEAACQAAIhAYBEDqhUQ9gBRAAAkAACAABIBAEAiB0ggAVsgQCQAAIAAEgAARCgwAIndCoB7ACCAABIAAEgAAQCAIBEDpBgApZAgEgAASAABAAAqFBAIROaNQDWAEEgAAQAAJAAAgEgQAInSBAhSyBABAAAkAACACB0CAAQic06gGsAAJAAAgAASAABIJAAIROEKBClkAACAABIAAEgEBoEAChExr1AFYAASAABIAAEAACQSAAQicIUCFLIAAEgAAQAAJAIDQIgNAJjXoAK4AAEAACQAAIAIEgEAChEwSokCUQAAJAAAgAASAQGgRA6
IRGPYAVQAAIAAEgAASAQBAIgNAJAlTIEggAASAABIAAEAgNAiB0QqMewAogAASAABAIMgFJkgRBkiSJLEdRFFEUTYFkBDj+3gmA0PneaxDsBwJAAAgAgbEJDL55u23XwZ/nrjh8rO7x0xeKgpIMD4/O/C0xguX+/d+FZy+0yrI8dkYQ43sjAELne6sxsBcIAAEgAAQCJOBy8fEpeR03e660dqZlbIuI5JqaWx8+fhERGXOyrkkQhIdPXuRuLR4ZGQkwY4j+HRAAofMdVBKYCASAABAAAuMmoChKTV3T5dabag6iKGZvLIyI5Bg2+o/KM0a2vX0PauubJZjUMYiEywEInXCpSfADCAABIAAE7AjIssywHM+Lxkm3m1+bUxi1OPPBw2dGoCAI0YszZBnf0zJC4eD7JwBC5/uvQ/AACAABIAAEfBOQZfnQH7VZG3YbUVrbu+cvXM2wXEp6vqCvTe7uvVd6sApWJRuUwuYAhE7YVCU4AgSAABAAAvYE3G43w3IfPn6WJLm7515U7JoHD5/e7nvAsFHpawtudvd1dPbGJWTf739snx5Cv2cCIHS+59oD24EAEAACQMAZgRPV53+Zl/R/Py0qOVjT//CZgv/d73+6t/TYz7+nxizNbLx4DZ66csbyO4sFQuc7qzAwFwgAASAABMZHQJIkUZRMakaWZRH/G1+ekCr0CYDQCf06AguBABAAAkAACACBcRIAoTNOcJAMCAABIAAEgAAQCH0CIHRCv47AQiAABIAAEHBEQFEUnhfG/Ljd/Jhx1AiOSoVIoU0AhE5o1w9YBwSAABAAAo4JCILIsJzxwbsCer/i8GjjrN1BDBm4PCVXFL277zi2AiKGFgEQOqFVH2ANEAACQAAIjJuAJMkHjtRG6Frn0B+1t3rv9z989uLl4F9D7//+POxya/M9w8OjPC+4ed7N85+HR/sfPfvzSsehP05lrN81c16CJnciubaOnnEbYySUJHiFlgHjGxyA0PkG0KFIIAAEgAAQCBIBnhfSs3caEzPtN3oDKkhWFLfb3dl9N275eoblcvKLyC2VA8pKjczzPMNGTzCTcZQLSQwCIHQMFHAABIAAEBibgCRJivrm67Hjjj+GusvL+NNPIKWiKKYHsCeQ2SQndUj+fv/TGXO0WZmYJZkfPv49Dju+rONJy9rxrx8X9ROviRhHPpU156f/liAIwjjSji+JwyY6iRUdaFayjBq4Q++cx/SVIQgdX2QgHAgAgX8QAVEUeV4QRbTPisvtVg+sbwPgBYFhuara88FGsyZrx9mmFqsBwS5XlOXMdbtXZ2wJdkHjyF+WlcTUDXfuPR5ThymKUlvfrE/qROdtKRVFaRwlulz85u1le4orxjfWSrLscgvrN+09deZPl8s9ptn+LcR7/aD2KYqS2lYFQbQa5nKhJnroaJ3/3DweT3F51aGKUxNfhCTLyqnTF1YF0mZOn7108OipMS30eDyPnrxcEJv+eXhkIvRA6DhBDXGAABAIZwI9vfd/npfERHLow+r/s9E9fY9It928kJC6sfzISUEYz6hJZjXm8bPnAzPnJl+/0fM13zEpyXJtffMPkZzLxVsttG61Z40T7JDWti6G5Z4+fzVmQaIk5e/Ybyy1qaxptGqCMTPxeDyfP4/MX7hmYPAvJ5FNccoOnFyfV/RbVFr2xj3bCg6YVuoIgnC2seX8havq51zjlcbm1vaOnvaOnmvXu3leIKc8+h89i4jkvGurcUNdl7dvZNRFFup283Nj1tTWN5GBvo4FQUhds/mPyjMT0dOyrJysa46cFjs6OuqrIGv4mXOXDhyttYbbhvx5+frMOQn8BKbEQOjYgoVAIAAE/ikEZFn+76/Jl1u7Xr4anBez6sOHTwy7wOV284JA/oiUZXnf/uPJqze5XO6vg6bq5PkZsxPu3qPEVlCLfjXw5qdf4y9duWGUoiiKJEkuF3/5SmdC2oZrbd3GqW91cODwyeUrcpwMe0NvPyxalq1qnakzF
nf33B2fzQLdEpxnoijKk6cv//Xjwtevh6wyq72jN2FlTkJqLsNGJ6/alJiak7gqN3FlTmJqHjs9LjEtt7a+ScQvHEWrjjK3jYzw55qu7is7/ufljpLyKpebdwsCma0giLuLK1asyXc+SdPVc/f//hPT0HjFuVOmmE+fv5g6Y8mfLR2mcP9fT5/907nQ8Xg85YdrVqzO5/lx3v4DoeO/OuAsEAAC4U+A59EMTdnBmuOVDd237i6KzyIljur/y4G/ZsxL7rjZ99VwyLK8cs3W/O37x3fbJVA7eZ4vPVC9Mb+I/H1/tunKomXp035NZFj0VHb7ZDyCFKhhpviiJKWs2VpV62iGpqX1pn4Di1uekjdKz3+Ycg7G18rqc3HL1/qqQVmWc/KLGDZ6ddYOnudFEd2KkiTp1cDrpLQ8ho3q7Xvg8XgkSRZFtDJsXszqGzf7Kk6c2bCp2GptW8ctho3q6g5AzymKUlJelZiaJwjjeYqe58UNm/bm5Bf5ctBqpBpy+mwAMzoej0cUxZXpW6udVbq1UBA6ViYQAgSAwD+OAJ7XWX7/4dOmi9cy1hWYhI4oiuVHauMSsoWvu6tKdW0Tw3K38WgX7Cp5+Oj59NkJV691kQW1d/a2dXT13ekvOVAVIkLH4/FUnDizaHmWm7e5v0Ya7/F4ZFkuKa80tM7OPYdJGWeKPOlfFcWzPm/vhk1Fso/13bKsxMWvZViuuLySLF2WlYTUPIblKmsajXuXfw+PzJidMDrqLi6v2rKjnIzv8XhcbnfG2oLktM2mpmuKZv3acbOXmRrVfv2W9dSYIa3t6E5iy9XOMWOaItQ3BDajY1S6k5k8U1kejweEjpUJhAABIPCPI9B0sS1u+QZBEJuaWxNX5ZhW4YyOurjFWVsKDnxlLi63e+r0uH2lxyXZ6SMq47NQkpStO8t/mZ9sO5Cg3/2hJHQeP33JsNzFlutOnP2yend19g5D65w+e9lJqkmJI4ryTz8vLztYLXtuQwkAABlNSURBVElyRCRnWqOjTlSw0+OYSK6z+w5ZoiCIsxekIKFz8rxxc6q6tiknv1SUpNID1aUHqsn4Ho/nVu99huWOHK83hY/5VZbl2QtSCosrApWAsiznbS2ZNS95HHeUxiF0Aqp0k9cgdExA4CsQAALhTECW5VGXq7PrzrPnr4zpGUmSElPzCvYeURTlwaNnU2csfTXwmqTQd6efieQu/NlGBhrHiqK4eb67517/w2cBdfp4yxb+/oOnL169ltA/87ZysqJs3l6WnbPHdnWwYcDED1wu1+wFKdsKDhrzB2SeX1Po8Lxw596j6zdukwtNFEVRb+uoVkmSHL0k88CRWodj870HT375fYWqdX6el/xgYo+LG2QEQXj34e+2670fPv5tyBF1GknFKAhS1OKsfSUnktO3Fu+vMhIaBw/6n2KrotxuavXJ9c5bDMt92fbwaps2wSZJ0uLEtWfOtaC5jcqzqzI3SxKlfStPNjIsN/D6jZE5eSBJktvNt3X0vHnz1nqbaUtB+ZrMHaY29mWpMvmiDJfLbbw0Q73P5XK5Zy9Iyd+x31ebcbv5e/efvHz1WpZlU9uub7hou0ZHEISXr163dfS43bzJTlGUYhZnlB91Wumk+yB0SBpwDASAQDgT4HmhpLx6xm8Je0r++O+vy7bvPqyOppIkNTRe5nmkMyRJKiqvNPXLp043R0RyDx8/t9Jxufmjx08zU7moxRnsjCUz5yaZBgxrEjVEFMWT9c3zY1bPX7g6IpI7+Mepdbl76xsumOIfrqhbnLjBj35Sd9xx8r+fmxpPnr2KiOQOV9g/k/x1hI6iKDe7++IS0K2cH6ZwJeWVhtZpv9HDsNyHj59UOIriyc7Zk5tf7HxlScP5FmNSZ1XWDiNnE23nX10u996SY+yMOIblliSuvdnVp+IVRXFp0sY9xRWq4nn4+PkPbHTVySZTi1ILqjmF1Mmy5ByeuA3ndvN4hXLU7qI/jFSiKG3cXKKa/e79h
5q6JrI28Vqf4qnT42z9Gh11H6yoY1huafJGJjJ6TeZ2N72qt7q26df5KaP6KntFUf680rE6yzsNZqBTDzLX7ZJl+emzATSHVGEzh4Tadt2F+TGr5sWsiYjkDlWcWptDtW1bofPy1Ztpvyaw0+OSV23+Xzb6WNVZw30EU1HW5uzNzS+29dF/xYHQ8c8HzgIBIBAmBEZH3Zu371+2YsMgfgTmdl8/w0a1d9wif4vbuqoonl17jzJTuU9/f7ZGuHX7fuT02J7b9z0ez9NnAz+wnPW2gjWVKErlR04y7IIbN28rijI66uYWZzIs12yZNDpZf+G/vywdHvH57G7NqYsJKTmmz/IVG6mQlTkJK3OqTvrc/uf6jV4mkqs5Zf9YsiF0rl0P4lNXF/5si5weNzsqpXh/5Y+zFjMsd6yqQZZlURS5uMxzF66SGPeWHlu4JNP54mJBlHbt/cMYsIvLK0mhQObs5PjLDb7MdTuZSC49uyAlPZ9hubnRaap6uHz1RkJK7qh77EfzRFnesGlfBMvtLal4NfDG7eZv9/UfOX56UXzWrDlJR47VSbJ5hs+XbaIoLYpf9/P8FdYpLlmWG85d+vdPsX330eN7dacvMezCa/Si8qv4of3nLwfV/G909UVOiy0ur95XemL27yn7yk58echrZWZ+UdmJ4v2VxeVVaszLVzsZ1qbNiJJUfvgkw3J62+a5xRkMG3XmPJqOUv9ZhY7LxadmbN66cz/eLkjcsKn4hynRbje1DGtPSUVMIJWulwZrdAwScAAEgED4EhAEEQsL7qa+GGJkxMWwXMmBatMMuZWBIIgb8opmzklUn/UlIyiKUlxeOXNO4sgI2tCssqaBW5w5MDhExrE9brl644cp3K59R9WzoigmpG5iWO7VgPnWw4WL15hIbtD3Pi7PXw5ev9Fr+ly73m0KuX6j97nv7Wfart9C9+YuXrO11hA6Yz51pSgeJx9rKS6eX5K4oaS8Un2W+8OHv8sOViek5vK8cONm78IlWW5aOlSdbGQiuU+fbKSnNXM1hOeFlDVbDK0zjiW0aj6yLNfUNf0el9Fz+4EkSaIotrbfmjpryeUrnaIopqzJP1lnrxdNhomiOIdLZVgOPWS+Mjdh5caElbnxKblnzl1++/b9mPqbzM3l4n+dnxIVl25NJYrotuyC2MyREXQ3aHXGlozMnaY7ZQ8fP/+CRb1NJgjCv/6z6PyFVkFAW2hGTotzud0uN8+w3MMnz2W8pbFadG39hQjW3GYURWnBAmj3vj/UaKIoJaRuYqcvfjP01rC57ox5MfLDR08jWO7wsdN4O0T+/02JKS6rMs3YVdY0RrCBVbpaIszoGOThAAgAgbAl8Pnz8O+x6Zu3lxk/eT8PjzIstyZr+5i/7EVRXLEmf0niOutCXUnSHuqZ8VvCqqwdzZfahodHrIONCasgiAkpuQzLGXvkCII4ZVrs7AUpVi11rR0t1+jqDu5j7arQafWxTQ4SOuVjP3XF8/zG/KL8HfuNT8a63TaftbutiKprm3K3FJOiU5LkqLiMuw8eHao4daKm0cSwpfUGE8n19AbwKLXH4+m5fT9yGrrZxLDc4uXryRtGpvz9fJVleXnKxnv9j8k4NXUXMjfuvn3nUczSTNPwTEYjjwcHh1RLRl1uSZIFQcSPkY/nMW+3m//XT4viV2y0NmasoVFjm7dw9aZtpd237v09PEKagTZFHB5BN6GOoZtQeH5R24Sw5WpnasY2l4t//OTF1JlxLhe1OeEBPG1jajM8L5jaNs8LU6YtWp29k6zcutPmNTp//fVW3cIgeklmcXnl0PuPvGVbzpZWNIcUaKXDU1em6oavQAAIhCeBmlMXIiK588Ttj1OnLzBsVPnhk4b08eU5erpkW+mCRWusKsTj8Tx++jI1fSsesaIZNrrkQNWYGd68dZeJ5GbNTTJW3ly73s2w3PrcPaJovltx+eoN9Fz3uJ7+NXlklRdGhP6Hz5hI7vJV71aBxil18HMkd
ETpSltX+43e9hu91ztv+/q0d/Rab8qg5caWtdg7Cw/X1F1YvjLvo+VlVeearn5haChF0lo/xzKegVP3v96YX+RQkVgztNovCOKsecnFZSfKD5+0xrcNuXCxjWG55NX55EoU25ikRLCNIAjiPC4tbvla2ypuudo5LybVeKN7cXmVSQ/9NfSOYbmKyjNk5mjdz+ai4rJKRVHQdpGpOSahc/rsJYY1txnUtlnUtg22qG1HchUnzpCFnrIIHZ7nj51oYCK5CLxjU1zC+r+G3pH2eDyec02t5M8D01k/X2FGxw8cOAUEgEA4EJBlOX/7/i/7zw7rv2VlWV6Xs4dhuQvN9jdrSLdlWT5W1cBOjzOe0lLPyrJ878GTJ09fjrrdd+4+Liqr+iGSmx2VMuZ7qo9Vn2FYblXWdmNYqq1Hi50bzqGXWxnqRy2lofEKw3Jt13tIk8jj1rbu4vIqJ5+rPiZsPB7Phw9/R0zhzjZ6V1GQRTic0SGTTMrxyfrmnXuObC04IJNvQ8BZH66oj2C5l6/Md/rGLPfLOxMy1+9JXrXZGInHTOIkgiTJP89Nnv5rgu2KdWsOiqLs3HOEYbmjJyh5YY15rf3W/Jg1/tXzl5deZawtmB+TRooJdTV07+3+d+8/vH//qfPm3dwtZRFs9OrsHaZovXceMix3tpFaBfX0+asp02Iv422yT9Y3z4le6aLvHt66/QCnotqM2rbTiLZdWFzBsOj5efV1cqqDJqHz4ePfbR29PC8Ovv6r+VJ7/Aq0IL32tHVhfj3DRr8aCPh1HCB0rO0KQoAAEAgrAoIgLli0Jiou3XheA22SxkYtX5lnUhW2bsuycvXaTYaNHnr7gYzw9Nkrho2KiIxSByF1IcsS9ISUdvfBzfMl5VXWIqpOno9go4vLtD3iBEHatH0/O2Px27cfKiobEtM2keNQVe15huWu+Z7Rud//tLG51fh8Wf3j63PvwRPSfvIYPyq88oSP1crfSuioy11tFwZt33WQYbmREepmCumR7bEgiNt2HohfkWOtFNv4zgNlWY5enJFu2WrSVw744awNTCTXd/ehrzjqXFrpwerMDbv9r0sWRan0YM2M3xJNcz8dN29HsFxi2kZ10mhkdDQpLW/3vqNkA/N4PM1/ormlNqKNKYqy/1ANw3KDr5GqKN5f9e//Lhp8TS0++/DxbyYyuvIkdVcRrZ1iudJybZsfdCcL73zo5vmyg9WrswpUZ0mhIwhi8qrNEZFcif4E/rX2boaNvvBnu4kMqvSpAVc63LoyYYSvQAAIhCGB7h40nR45PU4d3t6+/xi3fN1cLm101GWZKbB3/9nzAYaNVvfjN2Lce/D4h0iusOiYKnRGRkbmRq9qOK+9NkiQpI1od39u8/Yy0z0v3I9Hbdt9UN0yrho9Yxwdt3ztyKgrdc3W3tto13/j34EjtV8eDH734aMREowDt5tPWpVXftj8nkVZlnlBGHW71f2Fr167qW6mEgwbrHmOjrpjE7J5y27UX96TsCZ7x9KkDYZytaa1hoiSVFR6ImZppulZHmvMcYQIgrR8Zc6pM5fGTKsoCs/znz59VhfofPo07MsLUZRGXa5FyzKPVzX4n39SFOVs45Up02JNrp1uaGHY6OZLaP8nWZZvdt+ZNju+/+FTk5EVlWcYNurt2/dG+AjaITMjNX2r2raLyysjWO5ENfXUntuNHqcytRnctrltu7S2jZ+fj14Unz066po6fbEhtUmh4+KFmXOSMjfs+esvtFpZFMWjx+p/+T3F5Issy+lqpeP3fxmmOjmAGR0nlCAOEAAC3zGB4v1VP7Dc3tLjDMtlrEVPBW8pOODngW2rq6Mu1y+/J9efuUiecrn4pcnr0jK2ZmTviF6KHg4/cqzemM6RJClj/W60CDRmlanLliRpb+mx5SvzUtPzmUhu5+5DZ5taItio9Mwd6dk7yRUbX3RGWsa2hY7Xt5LmBXQsSVLe1tLVWeabGq1t3ctXbsjeWKiOygwb9eVX/
lewRzX+6rXOTdtKjRt8hkeiKM6ck7i1oNw0M2FEsB5IklRReebnecnv3lHTctaY4wt5/+HjT7/EP3+hPaHtJ5M9JcfnRq/GPGPQ8tvI6PmxmaaduNW5nISVuZu2l07/LWFvccXe0hO2+/IZBfXcvs+w0U+fUe91Hx11/b4wPS1za9aGXfOi0/7vx4WNF1pN0L7gzVy/K5puY8X7kbI536zdzLqPNzZcle692YoViZS1cc8qejm/JMllh6qT0vJS07fgtn143/4TuG1vn/37SmPCiRQ6siyfbby6aGl25rrdqzO2RcWlb999yHSbTBVAswKsdAMOCB0DBRwAASAQhgTQD8G1OxmWu3P30aMnL843XXn79oO1G/XvuSTJaVnbN+aXmAZdQRAbL7aeOXep/UbPm6Eh0xAiSVJr+60INspQP0YpkiSNjrovtXQMvh4S8SPKT56+unjpukkSvXz1+n/xPitGwuAdFJWciGC5AXpL6JvdfY+fvnDzgpsXhkdG1YO+Ow9NHIJhlSiKW3ceqK1vtmbe0XmbYbnGi/YbVVvjK4rScK4lclpcV8/dIFne1tHzi7OXIfA8PzLiGh11fR4ecbv50VGXr9eqK4pysaVj6swlX5bgjGn2588js+YlVddSN5I8Hs/7j5/ON7c2Xmztuf3AtIOzCkqUpP/8vGxPSYWhQjwez8Dgm0dPXxghkiS1d/QMDZk1YlHpiQg22tRmvjxjPzrqvnzlBmrbInr8HrftdvIZN1LoqJrp2fOBc01XLrVcf/zkxZf3dln9xZW+oOnPsRfVWRsACB0rEwgBAkAgfAgIgjiXS/09NoPsZ8fhXtPFtl/mJTvfoU4t4srVzulzVwgCtcG/89LL0TqJqHv95nsNznNwHnPwzdDMuUnHqhqcJwlqTDcvTPttuXGzwyhLUZQ9JRWx8Wv9380h47e2df04a0nDuUvW4dOINpEDtP3j4ZNZG3b7XzIcaBGyLB85Vr9t10EnZuPXl1ZnbdhtVdX+y/3zUjvDxty5h7YTDPTf6zdvZ85NOl59NtCEJqEzZnK10uOWO610U4YgdExA4CsQAAJhReAOfqIkJ7/YyWjhx3NBEJen5FTXUssU/MT3eDyCIP3GraoY67EaX5n8/Xk4ZmlWcVmV9ZlzX0kmEq5ufrgoPts0qzSRPCeStv1G70+/xI9YtoT+66930Usyqk+Zpy58lXXn7sM5UalHj9ePT4UoitLYfPXiZfPCWLI4tDFSat6xqoDHezIT67EkSSvTt5w6jea0TJOF1sgej2fw9dCsuUkBbYQoy3JufglaRmZZCGVbhClQluXi8srY+GyHotNIHqjQQZW+NIBKNwpSD0DomIDAVyAABMKKgLqKtqbO5g5IoH42Xmz997RFzp/0OVF9fuGyzPE94KMoyt6SY0mrNgf6Az1Qp8j4giAuTlxfc6pxgqKQzHPcx5U15zPX7zYNwGhzly3F1nBfpbx89WZxwrrC4grTenBf8a3hnd19s+YmmvZlNkX78vaG6b8t7+m9Zwqf4Ncn6CXtUR2dtwVBSEpbP2alyIpy6Oip+QtXu/S3Vo1pQEPjld9j0x88fjZmTF8ReF5YnLg20DYTkNDBlV7qvNKtpoLQsTKBECAABMKHwL37j5PX5L97NwlPLcmKcqyyISUj3+FdMFEUnfwQt2Xd3tE9N2bNnXvU9ru2MSc30OXi//WfReqruyY350BzO9d05UrrTVOqE9XnsjbsdjiQDw+PpmVsz96w21hrYsptzK+fPn1Oz9pevL/SukMgmRYvJyo3aTIywviOHzx89vO8FccqGxbErbne2eskE0mS8neWr8srdNLwhobezV+Yfruvf0wJ5b/o0VH3v35cZHpa0H+SgITOiepzmesLAl1XRxoAQoekAcdAAAiEGwFZVvw/rhKQw7IsV51s9PPmqYBy8xN5aXKmyzXOlT1+snVy6vWbd0Vlx53EDGoc9X3sZBHqLJfDO2uiKOXkFy9clj06O
vb7NclS1GNFUR49ebUseeO//xv74OHYEx4T1ApWA/DtKuXMuUs/oJceBDBXxPN8UenxO36351GLK9lfua3w8KRYPjT0/lhNAHfuum71+dqG24RCUZTc/CL/M2qmJNavIHSsTCAECAABIAAEvmMCoijtLvqDnR7n5AWrpJ9o3yCev//gyf5DNdFL0JYBa3P2uuh3aJPx4fi7IABC57uoJjASCAABIAAEHBGQZfnA0VMRkVxLa2dbR4/2uX7r2vVudHz9VltHT3tHb1sHeidXW0fP2cYr5UdqC/Ye3phftDprBxeXrm8ahN79edLu+XZHdkCkkCEAQidkqgIMAQJAAAgAgYkRUBSltv4Cw3IRkUim+P1E4f0P0RaIvj7/92PM+NaST8wJSD3JBEDoTDJQyA4IAAEgAAS+FQFBEJYlZ8en5KDPCvRZnpLr65OwMtfXJzE1LzE1Lze/ZHwPpX8r96FcWwIgdGyxQCAQAAJAAAh8lwS+7IU95j9JksaMo0b4LhGA0TQBEDo0D/gGBIAAEAACQAAIhBEBEDphVJngChAAAkAACEyMAN6PAM3mTCwbSB1CBEDohFBlgClAAAgAASAwiQQEQXCyC45RotstZG3czURys+Ym990Zz+ufjKzgIHQIgNAJnboAS4AAEAACQGAyCRTvr07N2ORwTzxJkrM27m6+hN6zfenKjZ/nJg8Pj0ymNZDXNyIAQucbgYdigQAQAAJAIJgEBEFMSsupOdXksJDXb4YYNsrtRhtSi6KYtCqv6eI1h2khWigTAKETyrUDtgEBIAAEgEDABGRZPtvUcvlKJxPJXb7aefnKDSeTOi1XOxmWEwRJfVt4Ylpe2cFqJwkDtg8SfF0CIHS+Lm8oDQgAASAABIJMQJblHYWH1+ftmTknsaS8uqS8Upblto5b1zt7bD+t17skSTpZh3YalCS0DFlRlMS0vNytxbAqOch19TWyB6HzNShDGUAACAABIPA1CSiKUrD3SFrGNmNKpq3dt9Bp7xZFqexgNcNy6qvOJUlKTNuUmLoJhM7XrLUglQVCJ0hgIVsgAASAABD4ZgR4XkxMzXP+GnZFUUoPVBkzOrIsJ6Ztghmdb1Z/k1owCJ1JxQmZAQEgAASAQAgQGBp6z0RyV6/dVBRFfY1DUVll2YFq20/JgSpZVvCMTrQgiB6PB83opKI1OrKihIA3YMKECIDQmRA+SAwEgAAQAAIhSKC6tunHWUs+fvpcVHbi6LEziqI0Nl9ruog+V1o7TZ+Wq52Koty7/4RhuZevXuOnrqSo2DW37zwMQdfApEAJgNAJlBjEBwJAAAgAgVAn0Nt7n1ucdfDoqfWbihyus1EUT1F55dLkdZ+HXRnZO7M37nGYMNRZ/OPtA6Hzj28CAAAIAAEgEHYEFMVz4+bt/ofPAnr9uCRJR4+fiYiMLimvFgS0oQ78CwMCIHTCoBLBBSAABIAAEAACQMCeAAgdey4QCgSAABAAAkAACIQBARA6YVCJ4AIQAAJAAAgAASBgTwCEjj0XCAUCQAAIAAEgAATCgAAInTCoRHABCAABIAAEgAAQsCcAQseeC4QCASAABIAAEAACYUAAhE4YVCK4AASAABAAAkAACNgTAKFjzwVCgQAQAAJAAAgAgTAg8P8BV+JU2uH9T+YAAAAASUVORK5CYII="
    }
   },
   "cell_type": "markdown",
   "id": "ac073e6b",
   "metadata": {},
   "source": [
    "![image.png](attachment:image.png)\n",
    "### Notation\n",
    "- $\\rho^{\\pi}(s, a)$: the occupancy measure of the state-action pair $(s, a)$ under policy $\\pi$. It measures the long-run discounted frequency with which the agent, acting under $\\pi$, is in state $s$ and takes action $a$.\n",
    "- $\\gamma$: the discount factor, typically in $[0, 1)$. It discounts how much later time steps count: the closer $\\gamma$ is to 0, the more the agent weights the present state and action; the closer it is to 1, the more weight future states and actions receive.\n",
    "- $t$: the time step, counted from 0, indexing the stages of the agent-environment interaction.\n",
    "- $P_{t}^{\\pi}(s)$: the probability of being in state $s$ at time step $t$ under policy $\\pi$, i.e. how likely the agent is to have reached $s$ after $t$ steps of following $\\pi$.\n",
    "- $\\pi(a|s)$: the probability that policy $\\pi$ selects action $a$ in state $s$; it expresses the policy's preference among actions in that state.\n",
    "\n",
    "### Reading the formula\n",
    "The formula $\\rho^{\\pi}(s, a) = (1 - \\gamma)\\sum_{t = 0}^{\\infty}\\gamma^{t}P_{t}^{\\pi}(s)\\pi(a|s)$ gives the long-run discounted frequency of the state-action pair $(s, a)$ under policy $\\pi$. Concretely:\n",
    "- The sum $\\sum_{t = 0}^{\\infty}\\gamma^{t}P_{t}^{\\pi}(s)\\pi(a|s)$: at each time step $t$, take the probability $P_{t}^{\\pi}(s)$ of being in state $s$, multiply by the probability $\\pi(a|s)$ of choosing action $a$ there, weight by $\\gamma^{t}$, and accumulate over all time steps. This aggregates, across time, how often the agent is in $s$ taking $a$, with later steps discounted by $\\gamma^{t}$.\n",
    "- $(1 - \\gamma)$: a normalizing factor. Since $\\sum_{t=0}^{\\infty}\\gamma^{t} = \\frac{1}{1-\\gamma}$, multiplying by $(1 - \\gamma)$ makes the occupancy measures of all state-action pairs sum to 1, so that $\\rho^{\\pi}$ is a proper probability distribution and sensibly reflects the long-run frequency of $(s, a)$ under $\\pi$."
   ]
  },
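  {
   "cell_type": "markdown",
   "id": "a1f4b9d2",
   "metadata": {},
   "source": [
    "To make the $(1 - \\gamma)$ normalization concrete, here is a minimal sketch on the 6-state chain `P` with `gamma = 0.5` from the first code cell. Since that chain has no actions, the sketch computes the state occupancy $\\rho(s) = (1-\\gamma)\\sum_{t}\\gamma^{t}P_{t}(s)$ via the closed form $(1-\\gamma)\\mu_0(I - \\gamma P)^{-1}$; the start distribution `mu0` (always begin in state 1) is an assumption made for the illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Transition matrix of the 6-state chain defined in the first cell.\n",
    "P = np.array([\n",
    "    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0],\n",
    "    [0.5, 0.0, 0.5, 0.0, 0.0, 0.0],\n",
    "    [0.0, 0.0, 0.0, 0.6, 0.0, 0.4],\n",
    "    [0.0, 0.0, 0.0, 0.0, 0.3, 0.7],\n",
    "    [0.0, 0.2, 0.3, 0.5, 0.0, 0.0],\n",
    "    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],\n",
    "])\n",
    "gamma = 0.5\n",
    "mu0 = np.zeros(6)\n",
    "mu0[0] = 1.0  # assumed start distribution: always begin in state 1\n",
    "\n",
    "# rho = (1 - gamma) * sum_t gamma^t * (mu0 @ P^t)\n",
    "#     = (1 - gamma) * mu0 @ inv(I - gamma * P)\n",
    "rho = (1 - gamma) * mu0 @ np.linalg.inv(np.eye(6) - gamma * P)\n",
    "print(rho, rho.sum())  # entries sum to 1 (up to floating point)\n",
    "```\n",
    "\n",
    "Without the $(1 - \\gamma)$ factor the entries would sum to $1/(1-\\gamma) = 2$ rather than 1, so the result would no longer be a probability distribution."
   ]
  },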
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "32903ba4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.10395776401475135 0.24711299717563512\n"
     ]
    }
   ],
   "source": [
    "def occupancy(episodes, s, a, timestep_max, gamma):\n",
    "    '''Estimate the occupancy measure of the state-action pair (s, a)\n",
    "    from its empirical visitation frequency in sampled episodes.\n",
    "\n",
    "    Parameters:\n",
    "        episodes: a list of episodes, each a sequence of\n",
    "            (state, action, reward, next_state) tuples.\n",
    "        s: the target state.\n",
    "        a: the target action.\n",
    "        timestep_max: the maximum number of time steps.\n",
    "        gamma: the discount factor.\n",
    "    '''\n",
    "    # rho accumulates the discounted visitation frequency; start at 0.\n",
    "    rho = 0\n",
    "    # total_times[t]: how many episodes reached time step t.\n",
    "    total_times = np.zeros(timestep_max)\n",
    "    # occur_times[t]: how many times (s_t, a_t) == (s, a) at time step t.\n",
    "    occur_times = np.zeros(timestep_max)\n",
    "    # Tally the visitation counts over every episode and time step.\n",
    "    for episode in episodes:\n",
    "        for i in range(len(episode)):\n",
    "            # Unpack the state, action, reward, and next state at step i.\n",
    "            (s_opt, a_opt, r, s_next) = episode[i]\n",
    "            total_times[i] += 1\n",
    "            # Count the step if it matches the target pair (s, a).\n",
    "            if s == s_opt and a == a_opt:\n",
    "                occur_times[i] += 1\n",
    "    # For each time step that was actually reached, add gamma^t times the\n",
    "    # empirical frequency occur_times[t] / total_times[t] of the pair.\n",
    "    for i in reversed(range(timestep_max)):\n",
    "        if total_times[i]:\n",
    "            rho += gamma**i * occur_times[i] / total_times[i]\n",
    "    return (1 - gamma) * rho\n",
    "\n",
    "gamma = 0.5\n",
    "timestep_max = 1000\n",
    "\n",
    "episodes_1 = sample(MDP, Pi_1, timestep_max, 1000)\n",
    "episodes_2 = sample(MDP, Pi_2, timestep_max, 1000)\n",
    "rho_1 = occupancy(episodes_1, \"s4\", \"概率前往\", timestep_max, gamma)\n",
    "rho_2 = occupancy(episodes_2, \"s4\", \"概率前往\", timestep_max, gamma)\n",
    "print(rho_1, rho_2)"
   ]
  },
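  {
   "cell_type": "markdown",
   "id": "c8d2e7f3",
   "metadata": {},
   "source": [
    "A quick way to sanity-check the estimator is to run it on a made-up degenerate MDP whose occupancy measure is known in closed form: with a single state and a single action, the pair is visited at every step, so for episodes of length $T$ the estimate should be $(1-\\gamma)\\sum_{t<T}\\gamma^{t} = 1-\\gamma^{T}$, essentially 1. The one-state MDP below is invented purely for this check, and the estimator is repeated so the sketch runs on its own.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def occupancy(episodes, s, a, timestep_max, gamma):\n",
    "    # Same estimator as in the cell above.\n",
    "    rho = 0\n",
    "    total_times = np.zeros(timestep_max)\n",
    "    occur_times = np.zeros(timestep_max)\n",
    "    for episode in episodes:\n",
    "        for i, (s_opt, a_opt, r, s_next) in enumerate(episode):\n",
    "            total_times[i] += 1\n",
    "            if s == s_opt and a == a_opt:\n",
    "                occur_times[i] += 1\n",
    "    for i in reversed(range(timestep_max)):\n",
    "        if total_times[i]:\n",
    "            rho += gamma**i * occur_times[i] / total_times[i]\n",
    "    return (1 - gamma) * rho\n",
    "\n",
    "# Degenerate MDP: one state, one action, so (s, a) occurs at every step.\n",
    "T, gamma = 50, 0.5\n",
    "episodes = [[(\"s\", \"a\", 0, \"s\")] * T for _ in range(10)]\n",
    "rho = occupancy(episodes, \"s\", \"a\", T, gamma)\n",
    "print(rho)  # 1 - 0.5**50, i.e. approximately 1.0\n",
    "```"
   ]
  },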
  {
   "cell_type": "markdown",
   "id": "cd32fab9",
   "metadata": {},
   "source": [
    "## Differences between occupancy-measure computation and Bellman-equation computation in reinforcement learning\n",
    "### Definition and purpose\n",
    "- Occupancy-measure computation: describes the long-run frequency, or probability distribution, with which the agent visits each state. Computing it tells us what fraction of its time the agent spends in each state, which helps analyze its behavior, evaluate a policy's performance, and guide policy optimization.\n",
    "- Bellman-equation computation: the core equation of reinforcement learning, used to compute state-value or action-value functions. Rooted in dynamic programming, it recursively relates the value of the current state to the values of future states; the goal is to find the optimal value function and, from it, the optimal policy.\n",
    "### What is computed\n",
    "- Occupancy measure: a probability distribution over the state space. For a given policy, it gives the probability of each state being visited. In a grid-world environment, for example, the occupancy measure tells us the long-run probability of the agent appearing in each cell.\n",
    "- Bellman equation: the value of states or actions. The state-value function is the expected long-run cumulative reward obtained by following the policy from a given state; the action-value function is the same quantity after first taking a particular action in that state. In a card game, for instance, the Bellman computation helps determine the value of different actions (call, raise, fold, and so on) given the current hand and game state.\n",
    "### How it is computed\n",
    "- Occupancy measure: usually derived from the environment dynamics and the policy. For a Markov decision process (MDP), it can be obtained by solving a system of equations involving the transition probabilities and the policy. In simple cases it can be built up step by step from products of the transition matrix and the policy vector; in complex environments, iterative methods or linear-algebra techniques may be needed.\n",
    "- Bellman equation: many solution methods exist, including dynamic-programming algorithms (policy iteration, value iteration), model-based reinforcement learning (the Dyna family), and model-free reinforcement learning (Q-learning, Sarsa, etc.). These algorithms repeatedly update the value-function estimate until it converges to the optimal value function. Value iteration, for example, sweeps over the states updating each value until convergence; Q-learning samples from the environment and applies the Bellman optimality equation to update the action-value function.\n",
    "### Dependence on an environment model\n",
    "- Occupancy measure: generally requires explicit knowledge of the environment model, such as the state-transition probabilities, because the frequency of transitions between states must be derived from the environment's dynamics. If the model is unknown or inaccurate, the computed occupancy measure can be significantly off.\n",
    "- Bellman equation: in model-based reinforcement learning, a model is needed to compute transitions and rewards when solving the equation. Model-free methods, however, need no explicit model: they estimate value functions from interaction samples, updating the estimates from the observed states, actions, and rewards, which is an advantage when the model is unknown or hard to obtain.\n",
    "### Convergence and stability\n",
    "- Occupancy measure: convergence depends on the complexity of the environment, the nature of the policy, and the computational method used. For complex non-Markovian environments or nonlinear policies, the computation may converge with difficulty, and the result can be sensitive to initial conditions.\n",
    "- Bellman equation: different solution algorithms have different convergence and stability properties. Value iteration, for example, converges in finite state spaces with discrete actions, though possibly slowly; model-free algorithms such as Q-learning also converge to the optimal value function under certain conditions, but may need many samples and iterations and can struggle in non-stationary environments. Some algorithms are also affected by the choice of initial values and the exploration-exploitation balance, so the stability of their results varies."
   ]
  }
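  ,
  {
   "cell_type": "markdown",
   "id": "d5a9b1e4",
   "metadata": {},
   "source": [
    "To ground the contrast above, the model-based side of the Bellman computation has a closed form for a Markov reward process: rearranging $V = r + \\gamma P V$ gives $V = (I - \\gamma P)^{-1}r$. A minimal sketch reusing `P`, `rewards`, and `gamma` from the first code cell:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Model of the Markov reward process from the first code cell.\n",
    "P = np.array([\n",
    "    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0],\n",
    "    [0.5, 0.0, 0.5, 0.0, 0.0, 0.0],\n",
    "    [0.0, 0.0, 0.0, 0.6, 0.0, 0.4],\n",
    "    [0.0, 0.0, 0.0, 0.0, 0.3, 0.7],\n",
    "    [0.0, 0.2, 0.3, 0.5, 0.0, 0.0],\n",
    "    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],\n",
    "])\n",
    "rewards = np.array([-1, -2, -2, 10, 1, 0])\n",
    "gamma = 0.5\n",
    "\n",
    "# Bellman equation in matrix form: V = r + gamma * P @ V, hence\n",
    "# V = inv(I - gamma * P) @ r (the inverse exists because gamma < 1).\n",
    "V = np.linalg.inv(np.eye(6) - gamma * P) @ rewards\n",
    "print(V)\n",
    "```\n",
    "\n",
    "The occupancy-measure closed form involves the same matrix $(I - \\gamma P)^{-1}$ at its core, which is why both computations depend on knowing the model $P$ in the model-based setting."
   ]
  }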
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
